Towards a Zero-One Law for Column Subset Selection
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank-$k$ matrix $B$ minimizing the sum of absolute values of differences to a given $n$-by-$n$ matrix $A$, $\min_{\textrm{rank-}k~B}\|A-B\|_1$, or more generally finding a rank-$k$ matrix $B$ which minimizes the sum of $p$-th powers of absolute values of differences, $\min_{\textrm{rank-}k~B}\|A-B\|_p^p$. Many of these algorithms are linear-time column subset selection algorithms, returning a subset of $\poly(k \log n)$ columns whose cost is no more than a $\poly(k)$ factor larger than the cost of the best rank-$k$ matrix.
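As a concrete, if brute-force, illustration of the column subset selection objective, the sketch below compares the best rank-$k$ approximation under squared error (via truncated SVD) against the best approximation whose column space is spanned by $k$ actual columns of $A$. The squared-error metric and the exhaustive search over subsets are simplifications for illustration, not the algorithms discussed here (which target entrywise losses and run in linear time):

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
k = 2

# Baseline: the best rank-k matrix under squared (Frobenius) error, via truncated SVD.
U, s, Vt = np.linalg.svd(A)
svd_cost = np.sum((A - U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]) ** 2)

def subset_cost(S):
    """Squared error when B is the projection of A onto the span of columns S."""
    C = A[:, list(S)]
    B = C @ np.linalg.pinv(C) @ A
    return np.sum((A - B) ** 2)

# Exhaustive search over all k-column subsets (fine at this toy size).
best_S, css_cost = min(
    ((S, subset_cost(S)) for S in itertools.combinations(range(A.shape[1]), k)),
    key=lambda t: t[1],
)

# Any rank-<=k matrix, including the column-subset one, costs at least svd_cost.
print(f"SVD cost {svd_cost:.3f} <= column-subset cost {css_cost:.3f}")
```

The point of the comparison is that restricting $B$'s column space to actual columns of $A$ can only increase the cost relative to the unrestricted rank-$k$ optimum; the column subset selection guarantees bound how large that increase can be.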
The above error measures are special cases of the following general entrywise low rank approximation problem: given an arbitrary function $g:\mathbb{R} \rightarrow \mathbb{R}_{\geq 0}$, find a rank-$k$ matrix $B$ which minimizes $\|A-B\|_g = \sum_{i,j} g(A_{i,j}-B_{i,j})$. A natural question is which functions $g$ admit efficient approximation algorithms. Indeed, this is a central question of recent work studying generalized low rank models.
In this work we give approximation algorithms for {\it every} function $g$ which is approximately monotone and satisfies an approximate triangle inequality, and we show both of these conditions are necessary.

Zhao Song, David Woodruff, Peilin Zhong
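The two structural conditions on $g$ can be probed numerically. One common formalization, taken as an assumption here (the precise constants may differ), is approximate monotonicity, $g(x) \le C\,g(y)$ whenever $|x| \le |y|$, and an approximate triangle inequality, $g(x+y) \le C'\,(g(x)+g(y))$. The sketch below estimates the worst-case ratios over a sample grid for $g(x)=x^2$, which satisfies both with small constants, and for $g(x)=|\sin x|$, whose monotonicity ratio blows up near the zeros of $\sin$:

```python
import math

import numpy as np

def monotonicity_ratio(g, samples):
    # max g(x)/g(y) over sampled pairs with |x| <= |y|;
    # a bounded value suggests approximate monotonicity.
    worst = 0.0
    for x in samples:
        for y in samples:
            if abs(x) <= abs(y) and g(y) > 0:
                worst = max(worst, g(x) / g(y))
    return worst

def triangle_ratio(g, samples):
    # max g(x+y) / (g(x) + g(y));
    # a bounded value suggests an approximate triangle inequality.
    worst = 0.0
    for x in samples:
        for y in samples:
            denom = g(x) + g(y)
            if denom > 0:
                worst = max(worst, g(x + y) / denom)
    return worst

samples = np.linspace(-3.0, 3.0, 61)  # arbitrary grid, for illustration only

square = lambda x: x * x
sine = lambda x: abs(math.sin(x))

# square: monotone in |x| (ratio 1), triangle ratio 2 (attained at x = y != 0).
# sine: monotonicity ratio is large, since |sin x| nearly vanishes at x = 3 ~ pi
# while being close to 1 at smaller |x|.
```

On this grid the squared loss attains monotonicity ratio $1$ and triangle ratio $2$, while $|\sin x|$ already shows a monotonicity ratio around $7$; refining the grid toward a zero of $\sin$ drives that ratio arbitrarily high, which is the kind of behavior the necessity direction rules out.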